Bulletin of Electrical Engineering and Informatics 
Vol. 9, No. 5, October 2020, pp. 2189~2197 
ISSN: 2302-9285, DOI: 10.11591/eei.v9i5.2622


Blob adaptation through frames analysis for 
dynamic fire detection 


Indrabayu, Rahmat Hardian Putra, Ingrid Nurtanio, Intan Sari Areni, Anugrayani Bustamin
Department of Informatics Engineering, Hasanuddin University, Indonesia
Department of Electrical Engineering, Hasanuddin University, Indonesia


Article Info

Article history:
Received Jan 17, 2020
Revised Mar 31, 2020
Accepted May 2, 2020

Keywords:
Blob analysis
Computer vision
Distance estimation
Fire detection
Pinhole model

ABSTRACT

This study aims to help visually impaired people detect fire and estimate its distance. Blind people
have difficulty recognizing the presence of fire from a safe distance, which increases the risk of burns.
Color models and blob analysis were used to detect the presence of fire in the blind person's path.
Before the fire detection stage, a cascade of the HSV and RGB color models was applied to segment
the reddish fire color. The size and shape of a dynamic fire were the parameters used in this paper to
distinguish fire from non-fire objects. The change in the area of the fire object, obtained at the blob
analysis stage every 10 frames, is the main contribution and novelty of this paper. After a fire was
detected, the distance from the fire to the blind person was calculated using a pinhole model.
This research used 35 videos with a resolution of 480x640 pixels. The results showed that the fire
detection system and the distance estimation achieved an accuracy of 88.86% and an MSE of 0.0358,
respectively.


This is an open access article under the CC BY-SA license. 


Corresponding Author: 


Indrabayu, 

Department of Informatics Engineering, 

Hasanuddin University, 

Jl. Poros Malino KM. 6, Bontomarannu, Gowa, South Sulawesi 92172, Indonesia. 
Email: indrabayu@unhas.ac.id


1. INTRODUCTION 

Blindness is a condition in which a person experiences vision impairment to a certain degree,
caused by factors such as illness or accident. Blind people face many challenges, such as difficulty
interacting even in daily activities. Some activities can endanger their safety, particularly those
involving sharp objects and objects that can cause fires. Thus, blind people tend to need the help of
others [1]. According to FEMA (Federal Emergency Management Agency), blind people have difficulty
sensing the presence of fire at a safe distance, which increases their chance of being injured.
Blind people also find it hard to recognize the early signs of a fire. Therefore, even small fires can
become a threat to the blind [1].

A new concept of smart sticks has been proposed to help blind people detect obstacles and even
navigate using GPS. However, this smart stick is still relatively expensive [2]. Other studies
proposed a smart stick integrated with ultrasonic, light, and water sensors [3-5]. Various other studies
have offered solutions to help blind people, but they are rarely able to recognize fire as a danger.
Recently, computer vision technology has also been implemented in applications and systems intended to
assist blind people. It has been used in a wide range of fields, especially in face recognition. Several
previous studies have shown that computer vision technology is reliable and accurate in recognizing the
human face [6-9]. Computer vision has shown notable results in recognizing non-solid objects as well.
For instance, one study implemented computer vision to detect fire and smoke using color features and
color models. The block method was adopted to reduce detection errors or false alarms caused by other
objects, because this method can recognize the natural movement features of fire. For the predetermined
scenario, this system produced 100% accuracy without any false alarms [10].

Segmentation is an important process in image processing technology. Jaiswal et al. introduced a new
method of image segmentation that reduces the earlier 3-dimensional segmentation to a 2-dimensional one using
genetic algorithms. The results show clear segments between objects based on their color features [11].
Another segmentation process was tested on noisy eye images. That study focuses on contour-based feature
segmentation, including brightness, color, and texture. From these results, the CNN algorithm is used to
cluster the parts of the eye that will be further processed, such as the iris, pupil, and sclera [12]. Research
conducted by Mengistu compares the performance of several segmentation techniques on Ethiopian
coffee varieties. The Otsu, Fuzzy C-means (FCM), and K-means methods were tested as segmentation methods.
The segmentation results were then classified with a Backpropagation Neural Network (BPNN). The best
performance, 94.54%, was obtained using FCM as the segmentation method [13].

Moreover, Yadav et al. proposed a fire detection system using videos taken with ordinary cameras. 
This study combines several methods to detect fire, such as the color detection method using the RGB color 
model, motion detection method to detect the fire movement, and the gray cycle detection method to detect 
gray color cycles of smoke produced by the fire. Also, the area dispersion detection method is applied to 
detect the spreading of fire pixels from several frames. This new system simulates an existing fire detection 
technique by adding optimization to reduce false alarms and improve accuracy. The percentage of system 
performance offered is 92.31%, with a false alarm rate of 7.69% [14]. 

Research on distance estimation has also been conducted. Sadreddini et al. examined a distance 
calculation method using a single camera for indoor environments. In this method, the initial step is to extract 
the appropriate vertical floor line from the snapshots sequence. Moreover, the line passing through the base 
side of the object is found simultaneously with the floor line. Then, the part of the floor line starting from this 
point is measured in pixels, by taking the intersection point of these lines, and converted to distance between 
the camera and the object. This method can be applied to buildings with regular lines on the floor. The results
of this study showed that the proposed method measures distances well, but the error between the calculated
and actual distance grows as the measured distance increases. Beyond a certain distance, however,
the error becomes constant [15].

Another method for distance estimation was proposed to estimate the distance between a colored
object and the camera using image processing. First, image color filtering based on the HSV color model was
performed on the image. Then, object detection was applied to label eight nearest connected components with
the same value using the Moore-neighborhood tracing algorithm. The distance of the object was estimated 
using the single point projection principle of the object height in the image, represented by the bounding 
rectangle, and the actual object’s height. Finally, an experiment was conducted using objects with different 
colors and distance variation in centimeters. The results showed that the method could estimate the real 
distance of the colored object from the camera [16]. 

Furthermore, research conducted by Rahman et al. used a new method to calculate the distance from 
an object with only a single image. The correlation between real distance and object height in the image 
is used in the training process so that the correlation between the two parameters is obtained. These results 
are used to estimate the distance of the real object through the height of pixels in the image. The philosophy 
of this method comes from seafarers. They use this method when they need to determine their distance from 
the coastline using lighthouse projection. This proposed system reached an accuracy of 98.76% [17]. 

Abdurrasyid et al. [18] studied obstacles for blind people, such as stairs, doors, and walls. The output is
a sound warning so that people with visual impairments can avoid obstructing objects while walking. In contrast
to that work, vehicle detection systems were developed using the Otsu method and K-means clustering in 2019,
and by merging the R-CNN and Kalman filter methods in 2020 [19-20]. The Otsu method was also used by
Soeleman et al. to detect moving human objects by utilizing an adaptive threshold in the Gaussian mixture
algorithm [21].

This research proposes an Android-based system for detecting fire and measuring its distance to
help blind people avoid the danger of being exposed to fire. Blob analysis and color models are used to detect
fire based on the reddish color of the fire. However, many other objects can be detected as fire because
their colors resemble fire, such as street lights or vehicle lights. Therefore, the dynamic motion of fire is used
to distinguish fire from the objects around it. The pinhole model is used to estimate the relative distance
between the blind person and the fire. Similar research has dealt with fire dynamics. However, those studies
only consider the fire as a dynamic object, whereas in this paper the subject (camera) is also moving considerably.


2. RESEARCH METHOD
This research comprises five main stages, which are data acquisition, image segmentation, image 
enhancement, fire detection, and distance estimation, as shown in Figure 1. 


Figure 1. Fire detection system design (recoverable block labels: Data Acquisition; Image Segmentation
with HSV Color Model and RGB Color Model; Distance Estimation)


2.1. Data acquisition 

In the data acquisition process, a point-of-view camera mount (Povie) holds the smartphone and
keeps the camera in a stable and safe position for blind people. The Povie is worn on the neck of the blind
person. The data collection process using the Povie is illustrated in Figure 2. The input data
are video footage of fire, taken with a smartphone camera at a resolution of 1920 x 1080 under the scenario
of use by the blind. The video is recorded at 30 frames per second. The data are taken directly by recording
fire in day and night conditions and in different places such as home yards, roadsides, and open fields.
The system processes the input video by reading each frame sequentially. The frames have a resolution of
1920 x 1080 in landscape orientation. These frames need to be preprocessed to improve the performance
of the system. The preprocessing crops the video to a ratio of 3:4, which changes the resolution to
810 x 1080, and then resizes the result by a scale factor of 0.592592. This process turns
the frames into portrait images with 480 x 640 resolution. This setting is intended to lower system computing
time and to narrow down the field of view, because the obstacles to be avoided are on the track of the blind.
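
The arithmetic of this preprocessing step can be checked with a minimal sketch. The function names are hypothetical, and the crop is assumed to be horizontally centered, which the text does not state; only the 3:4 ratio, the 0.592592 scale factor, and the resolutions come from the paper.

```python
def center_crop_box(width, height, ratio_w=3, ratio_h=4):
    """Return (left, top, right, bottom) of a crop with a ratio_w:ratio_h aspect.

    The full frame height is kept and the width is cut down.
    Assumption: the crop is horizontally centered (not stated in the paper).
    """
    crop_w = height * ratio_w // ratio_h
    left = (width - crop_w) // 2
    return left, 0, left + crop_w, height


def scaled_size(width, height, scale):
    """Return the (width, height) after resizing by a uniform scale factor."""
    return round(width * scale), round(height * scale)


box = center_crop_box(1920, 1080)
crop_w, crop_h = box[2] - box[0], box[3] - box[1]
print(crop_w, crop_h)                          # 810 1080
print(scaled_size(crop_w, crop_h, 0.592592))   # (480, 640)
```

The scale factor 0.592592 is approximately 640/1080, which is why the cropped 810 x 1080 frame lands on 480 x 640 after rounding.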








Figure 2. Data collection process using povie 


2.2. Image segmentation 
This research implements image segmentation, which is a process for identifying
and distinguishing foreground from background content. Segmentation is used to partition an image into
homogeneous regions with respect to some image features. Usually, the output of image segmentation is a binary
image where the desired foreground is white (equal to 1), and the undesired background is black (equal to 0).
Segmentation can also divide images into several parts based on the characteristics of the pixels. Commonly
used image segmentation methods are thresholding, active contour, color segmentation, edge detection,
watershed, and Hough transformation. Color segmentation uses a color model to segment objects from
images. The most common color models are RGB and HSV. In this study, the segmentation phase is carried
out with a two-stage cascade of color models, i.e. the HSV and RGB color models. The RGB color model
is an additive color model, in which red, green, and blue light are mixed to reproduce a broad range of colors.
The intensity of each color can vary from 0 to 255, which gives 16,777,216 different colors. A new color is
produced by combining the three colored light rays. At zero intensity of all three colors the result is
perceived as black, while full intensity of all three displays white. This color space describes colors in
terms of their shade and their brightness value.

The HSV color model defines colors in terms of hue, saturation, and value. Hue expresses true
colors, such as red, violet, and yellow. Hue is associated with the wavelength of light. Saturation defines
the purity of a color, which indicates how much white is given to the color. Value is an attribute that declares
the amount of light received by the eye regardless of the color [22]. In the HSV segmentation stage,
this research uses only pixels that have a reddish color, which represents the color of fire. In the HSV
color model, fire colors are within 0 - 0.9792 Hue, 0 - 0.6473 Saturation, and 0.8118 - 1 Value. The color
segmentation process is shown in Figure 3. Figure 3(a) displays the input fire image, and Figure 3(b) shows
the result of conversion to the HSV model. Next, the pixels in Figure 3(b) are filtered
based on the predetermined range of HSV values, so that only the color of the flame is obtained,
as shown in Figure 3(c). However, some objects still have a color similar to fire and fall into the
range used, so a second segmentation using the RGB color model is applied to reduce this segmentation
error. Figure 3(d) shows the result of the RGB color model segmentation.
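
The HSV filtering stage can be sketched as follows. This is an illustration only: it assumes the quoted ranges are on a normalized 0 to 1 scale (which is what Python's `colorsys` uses), the function names are hypothetical, and the second (RGB) stage is omitted because its thresholds are not given in the text.

```python
import colorsys

# Ranges quoted in the text, assumed to be on a normalized 0..1 scale.
HUE_RANGE = (0.0, 0.9792)
SAT_RANGE = (0.0, 0.6473)
VAL_RANGE = (0.8118, 1.0)


def is_fire_colored(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the HSV fire ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (HUE_RANGE[0] <= h <= HUE_RANGE[1]
            and SAT_RANGE[0] <= s <= SAT_RANGE[1]
            and VAL_RANGE[0] <= v <= VAL_RANGE[1])


def hsv_fire_mask(image):
    """Build a binary mask (1 = candidate fire pixel) from rows of RGB tuples."""
    return [[1 if is_fire_colored(*pixel) else 0 for pixel in row]
            for row in image]


# Tiny 1x3 example image: bright pale yellow, dark gray, fully saturated red.
example = [[(255, 230, 150), (30, 30, 30), (255, 0, 0)]]
print(hsv_fire_mask(example))  # [[1, 0, 0]]
```

With these ranges the filter selects bright, not fully saturated pixels, so the dark gray pixel fails on Value and the pure red pixel fails on Saturation.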





(a) (b) (c) (d) 


Figure 3. Color segmentation process, (a) The input fire image example, (b) HSV color model result, 
(c) The pixel filtering result, (d) RGB color model result 


2.3. The proposed algorithm for fire detection 

Fire constantly changes in size and moves irregularly. Because of this characteristic, blob
analysis is implemented to analyze the change of the segmented object. Blob stands for binary large object.
The main advantages of this technique are its high flexibility and excellent performance. Its limitations
are the required background-foreground relation and its pixel precision. The purpose of blob extraction is to
locate the blob, which is the connected region corresponding to the object being inspected in a binary image.
A blob consists of a group of connected pixels. Several things must be known to produce an optimal blob.
The blob width of each object in the foreground segmentation process needs to be analyzed,
because the blob value differs from object to object. It is influenced by
object features such as size and type, and by the video acquisition technique. The process starts by marking
the foreground area that is considered an object, then collecting the data of the segmented area, such as
the position of the initial pixel, the length along the x-axis and the y-axis, and the area in pixels [23].
The proposed algorithm for fire detection is shown in the following pseudocode:
Analyze change in object area using blob analysis algorithm


Input  : frame(i) from morphology process
Output : change in object area based on frames, and fire detection


IF mod(i, 10) == 0                                   // detection every 10 frames
    // reset the range values every 10 frames
    rangeMax = 0; rangeMin = 10000000; column = 1
    // calculate the rate of change (in percent) of the fire object area
    // between each frame and the next frame, up to 10 frames
    FOR skipCounter = i-9 : i-1
        changeArea(column) = (abs(areaPerFrame(skipCounter+1) - areaPerFrame(skipCounter))
                              / areaPerFrame(skipCounter)) * 100
        column = column + 1
    ENDFOR
    row = row + 1
    changeValue = sum(changeArea, 2)
    IF changeValue >= 100
        warning = "Fire Detected"
        fire = fire + 1
        detected = true
    ELSE
        warning = "Fire not Detected"
        notfire = notfire + 1
        detected = false
    ENDIF
ENDIF


The threshold value is determined experimentally using positive and negative data.
Positive data are videos of actual fire, and negative data are videos containing objects that have a similar
color to fire. In the experiments, the positive videos have an average size change above 100,
while the negative videos have an average size change below 100. The threshold value used for
detecting fire in this paper is therefore 100.
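
The decision rule in the pseudocode can be sketched in Python as below. The function names and the sample area sequences are hypothetical; skipping frame pairs whose first area is zero is an added assumption to avoid division by zero, not something the paper specifies.

```python
def total_area_change(areas):
    """Sum of frame-to-frame percentage changes in blob area."""
    total = 0.0
    for previous, current in zip(areas, areas[1:]):
        if previous == 0:
            continue  # assumption: skip pairs where no blob area was measured
        total += abs(current - previous) / previous * 100.0
    return total


def is_fire(areas, threshold=100.0):
    """Apply the paper's threshold of 100 to the summed area change."""
    return total_area_change(areas) >= threshold


# Hypothetical blob areas (pixels) over 10 consecutive frames.
flickering = [100, 200, 100, 200, 100, 200, 100, 200, 100, 200]
steady_lamp = [100, 101, 100, 101, 100, 101, 100, 101, 100, 101]
print(is_fire(flickering), is_fire(steady_lamp))  # True False
```

A strongly flickering region accumulates a large summed change across the window, while a nearly constant region such as a static lamp stays far below the threshold.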


2.4. Distance 

After the fire has been detected, the distance is estimated using the pinhole model.
The pinhole model, or pinhole camera, describes image formation as a geometric projection between 3D
points and the 2D points that correspond to their projections in the image. A pinhole camera projects
a real-world scene into a light-tight box or room through a small aperture in the middle of one side of the box.
The principle of the pinhole camera can be described simply as follows. Objects in the scene reflect light in
all directions. The small hole in the box allows only a small amount of light to pass, and the object is
projected upside down. This geometric mapping of 3D space into 2D space is called perspective projection.
The center of the perspective projection is the optical (camera) center, and the line perpendicular to the image
plane that passes through the optical center is called the optical axis. The point where the image plane
intersects the optical axis is called the principal (main) point [24]. The pinhole model is illustrated in
Figure 4 [25].


Figure 4. Pinhole model


Figure 4 shows the projection of a 3D object from the real world onto a 2D plane through a hole.
The diagonal length of the projected image (k) can be calculated from the focal
length (f), the diagonal length of the object (L), and the object distance from the camera (D). The view
angle of the image is defined as the angle subtended by the diagonal of the picture. The pinhole model
searches for the closest object to the camera by calculating the Y-axis coordinate value of the object. Then,
the distance is calculated by comparing the object coordinate value in pixels with a reference value, which is
a point in the real world that is used as the reference to find the estimated distance. The pixel references
($p_{ref}$) and actual distance references ($d_{ref}$) can be seen in Table 1.
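
For reference, the relation between k, f, L, and D mentioned above follows from similar triangles in the ideal pinhole model. The form below assumes the object plane is parallel to the image plane; it is the standard textbook relation and is not quoted from the paper.

```latex
\[
\frac{k}{f} = \frac{L}{D}
\quad\Longrightarrow\quad
k = \frac{f\,L}{D},
\qquad
D = \frac{f\,L}{k}
\]
```

Read this way, an object twice as far from the camera projects to half the image size, which is the intuition behind mapping pixel positions to distances.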


Table 1. Actual distance and pixel references


$d_{ref}$ (m)    $p_{ref}$ (px)
                 640
2                618
                 505
                 439
                 400
                 374
                 354
                 339


Based on Table 1, an object at a pixel distance (p) of 620 px along the Y-axis will use 618 px as
the nearest pixel reference and 2 m as the nearest actual distance reference. The actual distance (d) of
the object is given by (1). The comparison between actual and image (pixel) distance is illustrated in Figure 5.
The result of the fire detection system is determined based on the total change in object size. If it exceeds
the predetermined threshold, the object is detected as fire; otherwise, it is not detected as fire.
For objects that are positively detected as fire, a bounding box appears and the estimated distance
is displayed in the right corner of the screen. An example result is shown in Figure 6.


$d = d_{ref} \times \frac{p_{ref}}{p}$ (1)
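
The nearest-reference lookup described above can be sketched as follows. This sketch assumes that equation (1) has the form d = d_ref * p_ref / p, so that objects lower in the frame (larger p) are closer. The function name is hypothetical, and only the (618 px, 2 m) pair comes from the text; the second reference pair is a placeholder for illustration, not a value from Table 1.

```python
def estimate_distance(p, references):
    """Estimate the distance (m) of an object at pixel coordinate p.

    references: list of (p_ref_px, d_ref_m) pairs.
    Assumed form of equation (1): d = d_ref * p_ref / p.
    """
    # Pick the reference whose pixel coordinate is closest to p.
    p_ref, d_ref = min(references, key=lambda ref: abs(ref[0] - p))
    return d_ref * p_ref / p


# (618 px, 2 m) is the pair quoted in the text; (505 px, 3 m) is a
# hypothetical placeholder, NOT a value taken from Table 1.
references = [(618, 2.0), (505, 3.0)]
print(round(estimate_distance(620, references), 3))  # 1.994
```

For the 620 px example in the text, the nearest reference is 618 px, giving an estimate just under 2 m.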





Figure 5. Actual distance and pixel distance comparison

Figure 6. Output of the system


3. RESULTS AND DISCUSSION 
3.1. Fire detection 

The results of the fire detection system are expressed as percentage accuracy, as shown in
Figure 7. Videos 1 through 21 are positive data, and the last 14 videos are negative data. Each video has
205 frames on average. The system yielded 88.86% average accuracy. Based on this value,
the system can be considered reliable. However, the system performed poorly in some data scenarios due to
several factors. For example, in video 33, the system produced only 51.77% accuracy. In this scenario, the
system observed an accelerating vehicle at night, whose headlamp had color features similar to
fire and appeared to grow continuously. Hence, it exceeded the threshold, and the object was detected
as a fire. An example of this case can be seen in Figure 8. The system has also been tried in a real setting
at a foundation for the blind, with 6 respondents from the foundation. The reflection of the fire must
also be taken into consideration. A reflection can be detected as fire because of its color, which confuses
the system. This circumstance only occurred when the system was used in dark or dim places, an example of
which is shown in Figure 9.


Figure 7. Result of the fire detection system





Figure 8. Vehicle headlamp detected as fire

Figure 9. System inability to detect fire that produced a reflection


3.2. Distance estimation 

The result of the distance estimation system using the pinhole model is validated by calculating its
mean square error (MSE). The result can be seen in Table 2. Based on real data, the average time needed by
a blind person to cover a distance of 4 meters is 12.22 seconds. System validation is limited to a maximum
distance of 4 meters. Because the camera captures 30 frames per second and the system detects objects every
ten frames, the system can detect objects every 0.33 seconds. Therefore, it can be concluded that the system is
safe enough for a blind person, who takes about 12 seconds on average to reach an obstacle 4 meters ahead.
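
The timing argument can be made concrete with a short calculation. The variable names are hypothetical; the input numbers (30 fps, detection every 10 frames, 12.22 s to cover 4 m) are the ones given above, and the derived quantities are simple arithmetic rather than results reported in the paper.

```python
FPS = 30                  # camera frame rate
FRAMES_PER_CHECK = 10     # detection runs once every 10 frames
WALK_TIME_S = 12.22       # average time to cover the validated distance
WALK_DISTANCE_M = 4.0     # maximum validated distance

check_interval_s = FRAMES_PER_CHECK / FPS              # time between detections
walking_speed_mps = WALK_DISTANCE_M / WALK_TIME_S      # average walking speed
advance_per_check_m = walking_speed_mps * check_interval_s
checks_before_arrival = int(WALK_TIME_S / check_interval_s)

print(round(check_interval_s, 2))      # 0.33
print(round(advance_per_check_m, 3))   # 0.109
print(checks_before_arrival)           # 36
```

Under these numbers the user advances roughly 11 cm between detections, so about 36 detection cycles occur before an obstacle 4 meters ahead is reached.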


Table 2. Result of the distance estimation 


Actual distance (m)    Estimated distance (m)    Square error
2.00                   2.00                      0.0000
2.50                   2.20                      0.0900
3.00                   2.99                      0.0001
3.50                   3.13                      0.0529
4.00                   4.19                      0.0361
MSE                                              0.0358
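
As a quick check, the reported MSE is the mean of the square-error column of Table 2. The snippet below uses only that column as printed; the variable names are hypothetical.

```python
# Square-error column of Table 2, one entry per actual distance (2.0 to 4.0 m).
squared_errors = [0.0000, 0.0900, 0.0001, 0.0529, 0.0361]

mse = sum(squared_errors) / len(squared_errors)
print(round(mse, 4))  # 0.0358
```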


The highest error is obtained when the object is 2.5 meters away, where the system estimated the distance
as 2.2 meters. The lowest error is 0, which occurred at an actual distance of 2 meters, where the estimated
distance is also 2 meters. The error occurs because the camera angle during system use differs from
the camera angle during the data collection process. This difference in angle causes the estimated
distance to differ from the actual distance. Another factor that increases the error is the height of
the camera during system use. If the height of the camera when the system is used differs significantly
from the height when the distance data were taken, the scale value used to calculate the distance also differs,
which increases the difference between the actual and estimated distance.


4. CONCLUSION 

In this research, color models and blob analysis were applied to detect fire, and the pinhole
model was then used to estimate the fire distance in dynamic scenarios designed for visually impaired people.
The input of the system is 35 videos with a resolution of 480x640 pixels, of which 21 videos contain actual
footage of fire and the rest show objects that have a similar color to fire. The fire detection system using
blob analysis yielded 88.86% accuracy, while the distance estimation using the pinhole model performed well,
with an MSE value of 0.0358. In the future, the system can be modified to suit broader scopes, such as
pothole detection and electric pole detection.